SwiftTuna: Incrementally Exploring Large-scale Multidimensional Data
Abstract
Advances in distributed computing technologies open up new possibilities for data exploration, even for datasets with a few billion entries. In this paper, we present SwiftTuna, an interactive system that brings modern cluster computing technologies (i.e., in-memory computing) to InfoVis, allowing rapid and incremental exploration of large-scale multidimensional data without building precomputed data structures (e.g., data cubes). Our performance evaluation demonstrates that SwiftTuna enables exploration of a real-world dataset with four billion records while keeping the latency between incremental responses within a few seconds.
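The idea of incremental responses without a precomputed structure can be illustrated with a minimal sketch. This is not SwiftTuna's implementation; the function name `incremental_counts`, the row layout, and the in-process partition list are assumptions made purely for illustration. Each partition is aggregated in turn, and a partial result is emitted after every partition so a front end could refresh a chart before the full scan finishes.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, Iterator, Sequence, Tuple


def incremental_counts(
    partitions: Iterable[Sequence[tuple]],
    key_index: int,
) -> Iterator[Tuple[int, Dict[Hashable, int]]]:
    """Yield (rows_seen, counts_so_far) after each partition is processed.

    Hypothetical helper: it scans raw rows directly instead of querying
    a precomputed data cube, and reports a running aggregate each step.
    """
    running: Counter = Counter()
    rows_seen = 0
    for part in partitions:
        # Fold this partition's rows into the running aggregate.
        running.update(row[key_index] for row in part)
        rows_seen += len(part)
        # Emit a snapshot copy so later updates do not mutate it.
        yield rows_seen, dict(running)


if __name__ == "__main__":
    # Toy data standing in for partitions of a much larger dataset.
    demo_partitions = [
        [("a", 1), ("b", 2)],
        [("a", 3)],
    ]
    for seen, counts in incremental_counts(demo_partitions, key_index=0):
        print(seen, counts)
```

In a real cluster setting the partitions would live in distributed memory and be processed in parallel; the sketch only shows the running-aggregate pattern that makes partial answers available early.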
Similar resources
An Efficient Encoding Scheme to Handle the Address Space Overflow for Large Multidimensional Arrays
We present a new implementation scheme for multidimensional arrays that handles large-scale, high-dimensional datasets that grow incrementally. The scheme implements a dynamic multidimensional extendible array employing a set of two-dimensional extendible arrays. Multidimensional arrays provide many advantages, but they have some problems as well. The Traditional Multidimensional array is not dyn...
Mining Clickstream-Based Data Cubes
Clickstream analysis can reveal usage patterns on a company's websites, giving a much improved understanding of customer behaviour. This can be used to improve customer satisfaction with the website and the company in general, yielding a great business advantage. Such information has to be extracted from very large collections of clickstreams on websites. This is challenging data mining, both in...
Exploring Scientific Discovery with Large-Scale Parallel Scripting
Scientists and the organizations that fund scientific research frequently face difficult questions about how to allocate scarce resources. Should they pursue safe avenues of investigation that incrementally extend current knowledge? Or should they pursue ideas that are far off the beaten track, which are less likely to bear fruit, but more likely to provide revolutionary insights? One group at ...
Exploring Industrial Data Repositories: Where Software Development Approaches Meet
Lots of data are gathered during the lifetime of a product or project in different data repositories that may be part of a measurement program or not. Analyzing this data is useful in exploring relations, verifying hypotheses or theories, and in evaluating and improving companies’ data collection systems. The paper presents a method for exploring industrial data repositories in empirical resear...
Path Planning with Incremental Roadmap Update for Large Environments
Recent research results suggest that one can incorporate motion-planning techniques into the control loop of 3D navigation or tele-operation for more efficient navigation. However, the motion planner with this approach may not scale up well for large workspaces. In this paper, we propose a novel approach to overcome this scalability problem. We limit the region of interest for path-finding to a...
Publication date: 2016